Hooman Sabarou & Mounika Chevva (Advisor: Dr. Seals)
Martensite Starting Temperature
Application
| Variable | Min | Max | Mean | Median | SD |
|---|---|---|---|---|---|
| Ms (Martensite Start Temp) | 310.00 | 784.00 | 601.80 | 605.00 | 120.00 |
| C (Carbon) | 0.00 | 1.46 | 0.36 | 0.33 | 0.10 |
| Mn (Manganese) | 0.00 | 4.95 | 0.79 | 0.69 | 0.30 |
| Ni (Nickel) | 0.00 | 27.20 | 1.56 | 0.15 | 0.50 |
| Si (Silicon) | 0.00 | 3.80 | 0.35 | 0.26 | 0.20 |
| Cr (Chromium) | 0.00 | 16.20 | 1.04 | 0.52 | 0.70 |
Untransformed Model: Directly modeled Ms using predictors like C, Mn, Ni, Si, Cr, with interaction terms.
Log-Transformed Model: Modeled log(Ms) to handle non-normality and stabilize variance, using the same predictors and interaction terms.
Model Improvements: Predictor removal, introduction of interaction terms, and outlier removal.
Model Diagnostics: ANOVA, AIC, cross-validation, multicollinearity checks, and removal of influential points.
Model Evaluation: The log-transformed model showed significantly better performance with a lower AIC and cross-validation MSE. Residual deviance and cross-validation confirmed that the log model generalized better to unseen data.
Ms = 769.41 - 286.71 C - 16.42 Mn - 14.04 Ni - 13.89 Si - 10.13 Cr - 41.45 C:Mn - 8.36 C:Ni
| Variable | Mean ± SD | Coefficient Estimate | P-value |
|---|---|---|---|
| C | 0.36 ± 0.1 | -286.71 | < 2e-16 |
| Mn | 0.79 ± 0.3 | -16.42 | 1.36E-13 |
| Ni | 1.55 ± 0.5 | -14.04 | < 2e-16 |
| Si | 0.35 ± 0.2 | -13.89 | 1.70E-13 |
| Cr | 1.04 ± 0.7 | -10.13 | < 2e-16 |
| C:Mn | N/A | -41.45 | < 2e-16 |
| C:Ni | N/A | -8.36 | 9.68E-10 |
AIC: 13545
BIC: 1080321
R^2: 0.9016 (90.16%)
Adjusted R^2: 0.9010 (90.10%)
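The fitted untransformed equation can be evaluated directly for a given composition. The following is a minimal sketch in Python; the function name `predict_ms` is hypothetical, and the inputs are assumed to be in the same units as the summary table (presumably wt%, which the source does not state). Interaction terms such as C:Mn are taken as the product of the two concentrations.

```python
def predict_ms(c, mn, ni, si, cr):
    """Evaluate the untransformed Ms equation for one steel composition.

    Coefficients are copied from the fitted equation above. Input units
    are an assumption (same units as the summary table).
    """
    return (
        769.41
        - 286.71 * c
        - 16.42 * mn
        - 14.04 * ni
        - 13.89 * si
        - 10.13 * cr
        - 41.45 * c * mn   # C:Mn interaction, modeled as a product
        - 8.36 * c * ni    # C:Ni interaction, modeled as a product
    )


# Example: the mean composition from the summary table.
ms_at_mean = predict_ms(c=0.36, mn=0.79, ni=1.56, si=0.35, cr=1.04)
print(round(ms_at_mean, 2))
```

Because of the interaction terms, the effect of carbon is not a fixed -286.71 per unit: it becomes more negative as Mn and Ni increase.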
log(Ms) = 6.69 - 0.51 C - 0.032 Mn - 0.0255 Ni - 0.0226 Si - 0.0175 Cr - 0.0751 C:Mn - 0.0154 C:Ni
| Variable | Mean ± SD | Coefficient Estimate | P-value |
|---|---|---|---|
| C | 0.36 ± 0.1 | -0.51 | < 2e-16 |
| Mn | 0.79 ± 0.3 | -0.032 | < 2e-16 |
| Ni | 1.55 ± 0.5 | -0.0255 | < 2e-16 |
| Si | 0.35 ± 0.2 | -0.0226 | 4.48E-13 |
| Cr | 1.04 ± 0.7 | -0.0175 | < 2e-16 |
| C:Mn | N/A | -0.0751 | < 2e-16 |
| C:Ni | N/A | -0.0154 | 1.01E-11 |
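Predictions from the log model have to be exponentiated to return to the Ms scale. The sketch below uses the coefficient estimates from the table; the intercept is taken as +6.69, an assumption that keeps predictions inside the observed Ms range. The function names are hypothetical, and interaction terms are again treated as products.

```python
import math


def predict_log_ms(c, mn, ni, si, cr):
    """Linear predictor of the log model: returns log(Ms)."""
    return (
        6.69               # intercept, assumed positive
        - 0.51 * c
        - 0.032 * mn
        - 0.0255 * ni
        - 0.0226 * si
        - 0.0175 * cr
        - 0.0751 * c * mn  # C:Mn interaction
        - 0.0154 * c * ni  # C:Ni interaction
    )


def predict_ms_from_log(c, mn, ni, si, cr):
    """Back-transform the log-scale prediction to the original Ms scale."""
    return math.exp(predict_log_ms(c, mn, ni, si, cr))


def percent_change_per_unit(coef):
    """Percent change in Ms per unit increase of a predictor in a log model."""
    return (math.exp(coef) - 1.0) * 100.0


print(predict_ms_from_log(c=0.36, mn=0.79, ni=1.56, si=0.35, cr=1.04))
# Main effect of carbon when Mn = Ni = 0 (no interaction contribution).
print(percent_change_per_unit(-0.51))
```

In a log model the coefficients act multiplicatively: each unit increase in a predictor scales Ms by `exp(coef)` rather than shifting it by a fixed amount.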
| Model | AIC | BIC | R² |
|---|---|---|---|
| I (Basic) | 15699 | 2984481.84 | 0.753 |
| II (Remove C=0) | 15010 | 2506169.56 | 0.788 |
| III (Remove Co, Mo-C:Mn) | 14984 | 2465894.55 | 0.791 |
| IV (Remove V, C:Ni) | 14935 | 2384283.55 | 0.798 |
| V (Log Model) | -3578 | 72.79 | 0.808 |
| VI (Influential Points Removal) | 14751 | 2142102.54 | 0.816 |
| VII (Influential Points Removal-Log) | -3733.4 | 71.99 | 0.826 |
| VIII (Outliers Removal) | 13545 | 1080328.38 | 0.902 |
| IX (Outlier Removal-Log) | -4756.5 | 68.34 | 0.914 |
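For reference, the standard AIC and BIC for a linear model with Gaussian errors can be computed from the residual sum of squares. This is a minimal sketch using the textbook definitions, not necessarily the exact routine behind the table; `rss`, `n`, and `k` are hypothetical inputs, with `k` counting every estimated parameter including the error variance. AIC and BIC are only directly comparable between models fitted to the same response, so the sketch is most useful for ranking models within the Ms group or within the log(Ms) group.

```python
import math


def gaussian_log_likelihood(rss, n):
    """Maximized log-likelihood of a least-squares fit with Gaussian errors."""
    return -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)


def aic(rss, n, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * k - 2.0 * gaussian_log_likelihood(rss, n)


def bic(rss, n, k):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return k * math.log(n) - 2.0 * gaussian_log_likelihood(rss, n)


# Toy numbers for illustration only (not taken from the study).
print(aic(rss=250.0, n=100, k=9), bic(rss=250.0, n=100, k=9))
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes extra parameters more heavily than AIC once n exceeds about 7.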
Two cross-validation methods were used: k-fold (5-fold and 10-fold) and leave-one-out cross-validation (LOOCV).
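Both procedures can be sketched in a few lines of Python. The example below is a simplified illustration, not the analysis code used for this project: it fits a one-predictor least-squares line instead of the full model, and the data are synthetic toy values. The helper names `fit_line`, `k_fold_mse`, and `loocv_mse` are hypothetical. LOOCV is treated as k-fold with k equal to the number of observations.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_mse(xs, ys, k, seed=0):
    """Mean squared prediction error over k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    squared_error = 0.0
    for fold in folds:
        held_out = set(fold)
        train_x = [xs[i] for i in idx if i not in held_out]
        train_y = [ys[i] for i in idx if i not in held_out]
        a, b = fit_line(train_x, train_y)
        squared_error += sum((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    return squared_error / len(xs)


def loocv_mse(xs, ys):
    """Leave-one-out CV: every observation is its own fold."""
    return k_fold_mse(xs, ys, k=len(xs))


# Synthetic toy data: Ms falls roughly linearly with carbon, plus noise.
rng = random.Random(1)
carbon = [0.05 * i for i in range(1, 21)]
ms = [770.0 - 290.0 * c + rng.gauss(0.0, 5.0) for c in carbon]
print(k_fold_mse(carbon, ms, k=5), k_fold_mse(carbon, ms, k=10), loocv_mse(carbon, ms))
```

Similar error estimates across 5-fold, 10-fold, and LOOCV indicate that the fit does not depend strongly on which observations are held out.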
Interpretation
The 5-fold and 10-fold cross-validation errors for the log-transformed model are nearly identical. This suggests that the log-transformed model is stable and performs consistently across different subsets of the data.
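The cross-validation errors for the log-transformed model (~0.0021, in squared log units) are far smaller than those of the untransformed model (~774 to ~780, in squared Ms units). Because the two models are scored on different scales, this gap is indicative rather than a like-for-like measure, but it is consistent with the log-transformed model fitting the data better and generalizing more effectively.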
The log-transformed model performs better in terms of LOOCV error, suggesting it is more reliable for prediction.
If the purpose of a model is interpretability or making predictions on the original scale of Ms, the untransformed model may still be relevant despite the higher LOOCV error. However, for optimal prediction accuracy, the log-transformed model is superior based on these results.
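When predictions are needed on the original scale, the log model's error can also be expressed in Ms units by exponentiating its predictions before scoring them. The sketch below shows one way to do this; the helper names and the toy values are hypothetical and are not results from this study.

```python
import math


def mse(predicted, observed):
    """Mean squared error between two equal-length sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def mse_on_ms_scale(log_predictions, ms_observed):
    """Back-transform log(Ms) predictions with exp, then score in Ms units."""
    ms_predictions = [math.exp(lp) for lp in log_predictions]
    return mse(ms_predictions, ms_observed)


# Toy values for illustration only.
log_pred = [math.log(600.0), math.log(500.0)]
observed = [610.0, 495.0]
print(mse_on_ms_scale(log_pred, observed))
```

Scoring both models in Ms units puts their cross-validation errors on a common footing.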
Based on both the k-fold and LOOCV results, the log-transformed model is the preferred choice: it shows lower prediction error and stable performance across folds. It is therefore selected as the final model for predicting the martensite start temperature.